Abstract

In this paper we propose a new, wider class of hypergeometric heavy-tailed priors, given as the convolution of a Student-t density for the location parameter and a scaled Beta2 prior for the variance. These priors have heavier tails than the Student-t prior, and the prior on the variance behaves sensibly both at the origin and in the tail, which makes the class suitable for objective analysis. Since our proposal is represented as a scale mixture, it is well suited to detecting sudden changes in the model. Finally, we propose a Gibbs sampler that uses this new family of priors for modeling outliers and structural breaks in Bayesian dynamic linear models. We argue that it is clearly more suitable than the almost universal use of Inverted Gamma priors for the variances.

1 Introduction

Strong criticisms of the almost universal use of "vague" Inverted Gamma prior distributions have appeared in Gelman (2006), who also proposes half-Student priors for the scale parameter, $\tau$, in hierarchical models. On the other hand, Pericchi (2010) proposes the Beta distribution of the second kind (or Beta 2 distribution) as a sensible general replacement for Inverted Gammas as priors for scale parameters in hierarchical models. The Beta 2 distribution for the scale $\tau$ is:

Perez & Pericchi (2009) use the theory of Regularly Varying (RV) functions presented in Andrade & O'Hagan (2005) to check the robustness of this prior. Pericchi & Perez (2010) introduce the Cauchy-Beta2 prior, in which the location conditional on the scale is Cauchy and the scale is Beta2. For example, for the case $p = q = 1$, it is shown that the marginal prior for the location parameter fulfils the desiderata of a horseshoe density (Carvalho, Polson & Scott (2010), Theorem 1): i) it is unbounded at the origin and ii) its tails are as heavy as those of a Cauchy, in fact heavier. Furthermore, it has an explicit form. The scaled Beta2 distribution can be defined as a scale mixture of Gammas for the square of the scale as follows (see Perez & Pericchi (2009)):

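$$\tau^2 \mid \rho \sim \mathrm{Gamma}(p, \beta/\rho) \qquad (2)$$

$$\rho \sim \mathrm{Gamma}(q, 1) \qquad (3)$$

where Gamma$(a, b)$ denotes the Gamma distribution:

$$p(x \mid a, b) = \frac{1}{\Gamma(a)\, b^{a}}\, x^{a-1} \exp\{-x/b\}, \qquad a > 0,\ b > 0, \qquad (4)$$

with $b$ the scale parameter. Therefore the scaled Beta2 prior for the square of the scale is the following:

$$\pi(\tau^2) = \frac{\Gamma(p+q)}{\Gamma(p)\Gamma(q)}\, \frac{1}{\beta}\, \frac{(\tau^2/\beta)^{p-1}}{(1 + \tau^2/\beta)^{p+q}}, \qquad \tau^2 > 0. \qquad (5)$$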
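The mixture representation in (2) and (3) also gives a direct way to simulate from the scaled Beta2 prior. The following Python sketch is only illustrative: the function names are hypothetical, and the check it uses (the median of $\tau^2$ equals $\beta$ when $p = q = 1$) is obtained here by integrating the scaled Beta2 density and is not stated in the text above.

```python
import random

def sample_scaled_beta2(p, q, beta, rng):
    """Draw tau^2 from the scaled Beta2(p, q, beta) prior via its Gamma mixture."""
    rho = rng.gammavariate(q, 1.0)            # rho ~ Gamma(q, 1)
    return rng.gammavariate(p, beta / rho)    # tau^2 | rho ~ Gamma(p, scale = beta / rho)

def empirical_median(values):
    """Sample median, used only as a simple Monte Carlo check."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])

rng = random.Random(2024)
beta = 2.0
draws = [sample_scaled_beta2(1.0, 1.0, beta, rng) for _ in range(20000)]

# For p = q = 1 the CDF is x / (x + beta), so the median should be near beta.
print(empirical_median(draws))
```

Note that `random.gammavariate` takes a shape and a scale, which matches the parametrization in (4).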

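For precisions $\lambda = 1/\tau^2$, we assign the scaled Beta 2 as

$$\pi(\lambda) = \frac{\Gamma(q+p)}{\Gamma(q)\Gamma(p)}\, \beta\, \frac{(\beta\lambda)^{q-1}}{(1 + \beta\lambda)^{p+q}}, \qquad \lambda > 0.$$

Typically the hyper-parameters $p$ and $q$ are fairly small, for example $p = q = 1$, and $\beta$ is quite small. In this way we obtain a density that is bounded at the origin, has flat tails and gives a vague prior distribution.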
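The behavior at the origin can be checked numerically. The sketch below evaluates the scaled Beta2 density for the precision, $\pi(\lambda) \propto \beta(\beta\lambda)^{q-1}/(1+\beta\lambda)^{p+q}$, next to a Gamma$(\epsilon, \epsilon)$ density on the precision. The Gamma$(\epsilon,\epsilon)$ choice with $\epsilon = 10^{-3}$ is an assumption made here to stand in for a "vague" Inverted Gamma on the variance, and the function names are hypothetical.

```python
import math

def scaled_beta2_precision_pdf(lam, p, q, beta):
    """Scaled Beta2 density of the precision lambda = 1/tau^2."""
    log_const = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_const) * beta * (beta * lam) ** (q - 1) / (1 + beta * lam) ** (p + q)

def gamma_rate_pdf(lam, shape, rate):
    """Gamma density with a rate parameter (an Inverted Gamma on the variance)."""
    log_pdf = (shape * math.log(rate) - math.lgamma(shape)
               + (shape - 1) * math.log(lam) - rate * lam)
    return math.exp(log_pdf)

beta = 1e-4
for lam in (1e-8, 1e-4, 1.0, 1e4):
    # The scaled Beta2 stays bounded near zero; the Gamma(eps, eps) density blows up.
    print(lam, scaled_beta2_precision_pdf(lam, 1, 1, beta), gamma_rate_pdf(lam, 1e-3, 1e-3))
```

With $p = q = 1$ the scaled Beta2 density at $\lambda = 0$ equals $\beta$, while the Gamma$(\epsilon,\epsilon)$ density is unbounded there.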

Here we model the square of the scale of a Cauchy (or, more generally, a Student-t) with a scaled Beta 2 prior, and show that the marginal for the location can be written in explicit form. For particular values of the hyper-parameters, the marginal is found analytically. This strategy has several advantages; among them, it provides a suitable "quasi-non-informative" distribution for squared scale parameters in Bayesian statistics generally. Also, our scheme lends itself naturally to a simple Gibbs sampling procedure, which adds no substantial complication to the Inverted Gamma prior analysis but improves its performance.

This paper is organized as follows. In Section 2 we present the new family of heavy-tailed priors and illustrate its qualities. In Section 3 we present the proposed Gibbs sampler for dynamic linear models and illustrate the potential of our proposal with two examples: the annual Consumer Price Index in Puerto Rico and the popular series of quarterly gas consumption in the UK from 1960 to 1986. Some concluding remarks are presented in Section 4.



2 A new class of hypergeometric heavy tailed priors 

In this paper we consider the Student-t density coupled with the scaled Beta2 prior for the square of the scale, in order to achieve robustness with respect to the prior, to obtain sensible prior inputs with quasi-non-informative parameters, and to obtain analytical results, even in closed form for particular values.

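Result: Let $\theta \mid \tau^2 \sim$ Student-t$(\mu, \tau, \nu)$, where $\nu$ denotes the degrees of freedom, $\mu$ the location and $\tau$ the scale of the Student-t density:

$$\pi(\theta \mid \tau^2) = \frac{k_1}{\tau} \left(1 + \frac{(\theta-\mu)^2}{\nu\tau^2}\right)^{-(\nu+1)/2},$$

where $k_1 = \dfrac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\sqrt{\nu\pi}}$, and let $\tau^2$ have the scaled Beta2 prior with hyper-parameters $p$, $q$ and $\beta$. Therefore

$$\pi(\theta) = \begin{cases} k\,\beta^{q} \left(\dfrac{\nu}{(\theta-\mu)^2}\right)^{q+1/2} {}_2F_1\!\left(p+q,\ q+\tfrac{1}{2},\ \tfrac{\nu+1}{2}+p+q,\ 1-\dfrac{\beta\nu}{(\theta-\mu)^2}\right) & \text{if } \theta \neq \mu, \\[2ex] \dfrac{k_1\, Be(p-1/2,\ q+1/2)}{\beta^{1/2}\, Be(p,q)} & \text{if } \theta = \mu, \end{cases}$$

with $k = k_1\, Be(q+1/2,\ p+\nu/2)/Be(p,q)$, where $Be(a,b)$ denotes the beta function and ${}_2F_1(a,b,c,z)$ denotes the hypergeometric function (see 15.1.1 of Abramowitz & Stegun (1970)). The prior $\pi(\theta)$ is called here the Student-t-Beta2$(\nu,p,q,\beta)$, and it is a new, wider class of hypergeometric heavy-tailed priors (see the proof in the appendix). In order to illustrate the qualities of these priors, we show a particular case of this class.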
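One way to sanity-check the Result numerically is to integrate the Student-t density against the scaled Beta2 prior directly. The sketch below is an illustration under stated assumptions, not the derivation used in the appendix: it relies on the fact that if $\tau^2$ is scaled Beta2$(p,q,\beta)$ then $u = \tau^2/(\beta+\tau^2)$ is Beta$(p,q)$, uses a midpoint rule with the extra substitution $u = s^2$ to tame the endpoint at zero, and compares the result with the value at $\theta = \mu$, $k_1 Be(p-1/2, q+1/2)/(\beta^{1/2} Be(p,q))$. All function names are hypothetical.

```python
import math

def beta_fn(a, b):
    """Beta function Be(a, b) computed on the log scale."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def student_t_pdf(theta, mu, tau2, nu):
    """Student-t density with location mu, squared scale tau2 and nu degrees of freedom."""
    k1 = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi))
    return k1 / math.sqrt(tau2) * (1 + (theta - mu) ** 2 / (nu * tau2)) ** (-(nu + 1) / 2)

def marginal_pdf(theta, nu, p, q, beta, mu=0.0, n=20000):
    """Midpoint-rule approximation of the Student-t-Beta2 marginal density."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        u = s * s                                  # u in (0, 1), du = 2 s ds
        tau2 = beta * u / (1.0 - u)                # back-transform to tau^2
        weight = u ** (p - 1) * (1 - u) ** (q - 1) / beta_fn(p, q)
        total += student_t_pdf(theta, mu, tau2, nu) * weight * 2.0 * s
    return total * h

def value_at_location(nu, p, q, beta):
    """Closed-form value of the marginal at theta = mu, as stated in the Result."""
    k1 = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi))
    return k1 * beta_fn(p - 0.5, q + 0.5) / (math.sqrt(beta) * beta_fn(p, q))

print(marginal_pdf(0.0, 4.0, 1.0, 1.0, 1e-4), value_at_location(4.0, 1.0, 1.0, 1e-4))
```

The two printed numbers should agree to several digits, provided $p > 1/2$ so that the density at $\theta = \mu$ is finite.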

2.1 Example: the Student-t-Beta2(1,1,1,$\beta$) prior

For $\nu = p = q = 1$ the Student-t-Beta2(1,1,1,$\beta$) prior is the following:

$$\pi(\theta) = \frac{1}{2\sqrt{\beta}\,\left(1 + |\theta-\mu|/\sqrt{\beta}\right)^{2}}. \qquad (8)$$

We call (8) the Student-t-Beta2(1,1,1,$\beta$) prior; to the best of our knowledge, it is a novel distribution. We can obtain (8) using identities 15.3.3 and 15.1.13 of Abramowitz & Stegun (1970), and it is easy to show that (8) is a proper prior. In order to compare the Cauchy, Normal and Student-t-Beta2(1,1,1,$\beta$) priors, we match their quartiles at $\pm 1$. Therefore the scale for the Normal is 1.48 (variance 2.19), and for both the Cauchy and the Student-t-Beta2(1,1,1,$\beta$) priors the scale is 1. Figures 1 and 2 show that the Student-t-Beta2(1,1,1,$\beta$) prior has heavier tails than the Cauchy prior.
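The quartile matching and the tail comparison can be reproduced with a few lines of Python. This is a hedged sketch: it assumes location $\mu = 0$ and the closed form $\pi(\theta) = \sqrt{\beta}/(2(\sqrt{\beta}+|\theta|)^2)$, which is what the general Result reduces to for $\nu = p = q = 1$; the CDF and quantile expressions are obtained here by integrating that density, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def stb111_pdf(theta, beta=1.0):
    """Student-t-Beta2(1,1,1,beta) density with location zero."""
    root = math.sqrt(beta)
    return root / (2.0 * (root + abs(theta)) ** 2)

def stb111_quantile(prob, beta=1.0):
    """Inverse CDF for 0 < prob < 1, from F(x) = 1 - sqrt(beta) / (2 (sqrt(beta) + x)), x >= 0."""
    root = math.sqrt(beta)
    if prob >= 0.5:
        return root * (1.0 / (2.0 * (1.0 - prob)) - 1.0)
    return -root * (1.0 / (2.0 * prob) - 1.0)

def cauchy_pdf(theta, scale=1.0):
    """Cauchy density with location zero."""
    return 1.0 / (math.pi * scale * (1.0 + (theta / scale) ** 2))

# Normal standard deviation whose quartiles are +/- 1.
normal_sd = 1.0 / NormalDist().inv_cdf(0.75)
normal = NormalDist(0.0, normal_sd)

print("upper quartile, beta = 1:", stb111_quantile(0.75))
print("matched Normal sd:", round(normal_sd, 4))
for theta in (5.0, 20.0, 100.0):
    # Tail densities: Student-t-Beta2, Cauchy, Normal.
    print(theta, stb111_pdf(theta), cauchy_pdf(theta), normal.pdf(theta))
```

Under these assumptions the quartiles of the Student-t-Beta2(1,1,1,$\beta$) prior are $\pm\sqrt{\beta}$, so $\beta = 1$ gives the matching used in the comparison.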



Figure 1: Comparison of the Student-t-Beta2(1,1,1,1), Cauchy(0,1) and Normal(0,2.19) priors.

Figure 2: Comparison of the tails of the Student-t-Beta2(1,1,1,1), Cauchy(0,1) and Normal(0,2.19) priors.

3 Model Specification and modeling outliers and structural breaks

A Dynamic Linear Model (DLM) is specified (see Prado & West (2010)) by the set of equations:

$$y_t = F_t\theta_t + v_t, \qquad v_t \sim N(0, V_t),$$
$$\theta_t = G_t\theta_{t-1} + w_t, \qquad w_t \sim N(0, W_t), \qquad (9)$$

$t = 1, \dots, T$. The specification of (9) is completed by the prior distribution for the initial state $\theta_0$, which is assumed to be normally distributed with mean $m_0$ and variance $C_0$. Here $y_t$ and $\theta_t$ are $m$- and $n$-dimensional random vectors, and $F_t$, $G_t$, $V_t$ and $W_t$ are real matrices of the appropriate dimensions. In our applications $y_t$ is the value of a univariate time series at time $t$, while $\theta_t$ is an unobservable state vector.

The original proposal for using heavy-tailed priors for modeling and detecting outliers is considered in West (1984). The idea put into action in Petris, Petrone & Campagnoli (2010) is to represent the distributions as a scale mixture and to check when the latent variable is too large or too small, that is, far from one. Specifically, Petris et al. (2010) propose a Bayesian approach for modeling outliers in dynamic linear models that replaces the normal distribution of each component of $v_t$ and $w_t$ with a scale mixture of normal distributions, leading to a Student-t distribution. The resulting model accounts for possible outliers and structural breaks, not only in the observation process but also in the state process. Petris et al. (2010) use a Gibbs sampler in their proposal, and priors are specified for the degrees of freedom of the Student-t distribution. In our view, although a combination of Gibbs and Metropolis sampling can be implemented, the clever model proposed by Petris et al. (2010) is overly complex, slow, and difficult to analyze and elicit.

Our proposal is to use the Student-t-Beta2$(\nu,q,p,\beta)$ prior (using the Beta2 prior for the precision $\lambda = 1/\tau^2$) for modeling outliers in DLMs, in order to account for outliers in the observations and for structural breaks (i.e. abrupt changes in the state vector) of the specified model. Let $W_{t,i}$ denote the $i$th diagonal element of $W_t$, $i = 1, \dots, n$. The hierarchical Student-t-Beta2$(\nu,q,p,\beta)$ prior can be summarized in the following display:

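$$V_t^{-1} = \lambda_y\,\omega_{y,t}, \qquad W_{t,i}^{-1} = \lambda_{\theta,i}\,\omega_{\theta,ti},$$

$$\lambda_y \mid \rho_y \sim \mathrm{Gamma}\left(q, (\beta\rho_y)^{-1}\right), \qquad \lambda_{\theta,i} \mid \rho_{\theta,i} \sim \mathrm{Gamma}\left(q, (\beta\rho_{\theta,i})^{-1}\right),$$

$$\omega_{y,t} \sim \mathrm{Gamma}(\nu/2, 2/\nu), \qquad \omega_{\theta,ti} \sim \mathrm{Gamma}(\nu/2, 2/\nu),$$

$$\rho_y \sim \mathrm{Gamma}(p, 1), \qquad \rho_{\theta,i} \sim \mathrm{Gamma}(p, 1).$$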

For each $t$, the posterior distribution of $\omega_{y,t}$ (respectively $\omega_{\theta,ti}$) contains the information about outliers (respectively abrupt changes in the states). Values of $\omega_{y,t}$ (respectively $\omega_{\theta,ti}$) smaller than one indicate possible outliers or abrupt changes in the states (see Petris et al. (2010)). A Gibbs sampler is implemented using the full conditional distributions of the parameters and states of the model specified above. For example, the full conditional¹ for $\lambda_y$ is given by:



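¹The dots on the right-hand side of the conditional vertical bar in $\pi(\lambda_y \mid \ldots)$ denote every other random variable in the model except $\lambda_y$.

$$\pi(\lambda_y \mid \ldots) \propto \prod_{t=1}^{T} \lambda_y^{1/2} \exp\left\{-\frac{\lambda_y\,\omega_{y,t}}{2}\,(y_t - F_t\theta_t)^2\right\} \cdot \lambda_y^{q-1} \exp\{-\beta\rho_y\lambda_y\}, \qquad (10)$$

hence,

$$\lambda_y \mid \ldots \sim \mathrm{Gamma}\left(q + \frac{T}{2},\ \frac{1}{2}SS_y^{*} + \beta\rho_y\right) \qquad (11)$$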



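where $SS_y^{*} = \sum_{t=1}^{T} \omega_{y,t}\,(y_t - F_t\theta_t)^2$. We now summarize all the full conditional distributions. In these full conditionals the second argument of the Gamma distribution is a rate (inverse scale), as implied by the exponent in (10).

$$\lambda_y \mid \ldots \sim \mathrm{Gamma}\left(q + \frac{T}{2},\ \frac{1}{2}SS_y^{*} + \beta\rho_y\right), \qquad \lambda_{\theta,i} \mid \ldots \sim \mathrm{Gamma}\left(q + \frac{T}{2},\ \frac{1}{2}SS_{\theta,i}^{*} + \beta\rho_{\theta,i}\right),$$

where $SS_{\theta,i}^{*} = \sum_{t=1}^{T} \omega_{\theta,ti}\,\left(\theta_{ti} - (G_t\theta_{t-1})_i\right)^2$ for $i = 1, 2, \dots, n$;

$$\omega_{y,t} \mid \ldots \sim \mathrm{Gamma}\left(\frac{\nu+1}{2},\ \frac{\nu + \lambda_y (y_t - F_t\theta_t)^2}{2}\right), \qquad \omega_{\theta,ti} \mid \ldots \sim \mathrm{Gamma}\left(\frac{\nu+1}{2},\ \frac{\nu + \lambda_{\theta,i}\left(\theta_{ti} - (G_t\theta_{t-1})_i\right)^2}{2}\right),$$

$$\rho_y \mid \ldots \sim \mathrm{Gamma}(p+q,\ \beta\lambda_y + 1), \qquad \rho_{\theta,i} \mid \ldots \sim \mathrm{Gamma}(p+q,\ \beta\lambda_{\theta,i} + 1),$$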

for $i = 1, \dots, n$ and $t = 1, \dots, T$. Given all the unknown parameters, the states of the DLM are generated using the forward filtering backward sampling (FFBS) algorithm given in Frühwirth-Schnatter (1994), which is in practice a simulation version of the smoothing recursions.
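The observation-level updates above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the examples: the residuals $y_t - F_t\theta_t$ are held fixed (so the FFBS step for the states is omitted), the observation precision is taken to be $\lambda_y\omega_{y,t}$, the second Gamma argument in each full conditional is read as a rate, and the function name and default values are hypothetical.

```python
import random

def gibbs_observation_block(resid, nu=4.0, p=1.0, q=1.0, beta=1e-4,
                            n_iter=2000, burn=500, seed=1):
    """Gibbs updates for (omega_{y,t}, lambda_y, rho_y) given fixed residuals.

    Returns the posterior mean of each omega_{y,t} after burn-in.
    random.gammavariate takes (shape, scale), so scale = 1 / rate.
    """
    rng = random.Random(seed)
    T = len(resid)
    lam, rho = 1.0, 1.0
    omega_sum = [0.0] * T
    for it in range(n_iter):
        # omega_{y,t} | ... ~ Gamma((nu + 1)/2, rate = (nu + lam * e_t^2)/2)
        omega = [rng.gammavariate((nu + 1) / 2, 2.0 / (nu + lam * e * e)) for e in resid]
        # lambda_y | ... ~ Gamma(q + T/2, rate = SS/2 + beta * rho)
        ss = sum(w * e * e for w, e in zip(omega, resid))
        lam = rng.gammavariate(q + T / 2, 1.0 / (0.5 * ss + beta * rho))
        # rho_y | ... ~ Gamma(p + q, rate = beta * lam + 1)
        rho = rng.gammavariate(p + q, 1.0 / (beta * lam + 1.0))
        if it >= burn:
            for t in range(T):
                omega_sum[t] += omega[t]
    kept = n_iter - burn
    return [s / kept for s in omega_sum]

# Synthetic residuals with one gross outlier at position 20.
data_rng = random.Random(7)
residuals = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
residuals[20] = 8.0
posterior_means = gibbs_observation_block(residuals)
print(round(posterior_means[20], 3), round(sum(posterior_means) / len(posterior_means), 3))
```

In this toy run the posterior mean of $\omega_{y,t}$ at the contaminated time point falls well below one, which is the signal used to flag outliers. A full sampler would alternate these updates with the analogous state-level updates and an FFBS draw of the states.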

We now show two different applications. The DLMs were fitted using the R software package dlm, recently developed by Giovanni Petris (see Petris (2010)). This package also implements the FFBS algorithm.



3.1 Example 1: a numerical example of the annual Consumer Price Index in Puerto Rico

To illustrate the use of our proposal with a numerical example, we consider the annual Consumer Price Index in Puerto Rico on a log scale, shown in Figure 3. The time series plot shows possible outliers and structural breaks, so a model that accounts for these changes is needed. Figure 3 shows an outlier in 2000. The trend shows changes in 1990 and 2001, and the slope changes in 2003.

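Figure 3: Annual Consumer Price Index in Puerto Rico on a log-scale, 1984-2010.

We applied the proposed Gibbs sampler to the CPI data on the log scale with a local linear trend model, or linear growth model, which should give a reasonable fit. The linear growth model is the following:

$$y_t = \mu_t + v_t, \qquad v_t \sim N(0, V), \qquad (12)$$
$$\mu_t = \mu_{t-1} + \delta_{t-1} + w_{t,1}, \qquad w_{t,1} \sim N(0, \sigma^2_{w_1}), \qquad (13)$$
$$\delta_t = \delta_{t-1} + w_{t,2}, \qquad w_{t,2} \sim N(0, \sigma^2_{w_2}), \qquad (14)$$

with uncorrelated errors $v_t$, $w_{t,1}$ and $w_{t,2}$. This is a DLM with:

$$\theta_t = \begin{pmatrix} \mu_t \\ \delta_t \end{pmatrix}, \qquad F = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad G = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad W = \begin{pmatrix} \sigma^2_{w_1} & 0 \\ 0 & \sigma^2_{w_2} \end{pmatrix}.$$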
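To make the linear growth structure concrete, the following Python sketch simulates from it. It assumes the usual form with state $\theta_t = (\mu_t, \delta_t)'$, $F = (1, 0)$ and $G = ((1, 1), (0, 1))$, and constant error standard deviations; the function name and arguments are hypothetical and are not part of the dlm package.

```python
import random

# Observation and system matrices of the linear growth model (assumed standard form).
F = [1.0, 0.0]
G = [[1.0, 1.0],
     [0.0, 1.0]]

def simulate_linear_growth(T, sd_obs, sd_level, sd_slope, theta0=(0.0, 0.0), seed=0):
    """Simulate y_1..y_T from theta_t = G theta_{t-1} + w_t, y_t = F theta_t + v_t."""
    rng = random.Random(seed)
    level, slope = theta0
    ys = []
    for _ in range(T):
        # State evolution: the level moves by the previous slope plus noise.
        new_level = G[0][0] * level + G[0][1] * slope + rng.gauss(0.0, sd_level)
        new_slope = G[1][0] * level + G[1][1] * slope + rng.gauss(0.0, sd_slope)
        level, slope = new_level, new_slope
        # Observation: only the level is observed, with noise.
        ys.append(F[0] * level + F[1] * slope + rng.gauss(0.0, sd_obs))
    return ys

# With all noise switched off the series is a straight line with slope theta0[1].
print(simulate_linear_growth(5, 0.0, 0.0, 0.0, theta0=(0.0, 1.0)))
print(simulate_linear_growth(5, 0.1, 0.05, 0.01, theta0=(4.0, 0.03), seed=42))
```

A jump in the simulated level or slope at a single time point is what the latent $\omega_{\theta,ti}$ are meant to pick up in the hierarchical prior.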
We choose $\nu = 4$ degrees of freedom. The choice $\nu = 4$ is not new; in fact, different authors have recommended the Student-t prior with four degrees of freedom in order to obtain robustness in Bayesian statistics (see for example Gelman, Carlin, Stern & Rubin (1995)).

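On the other hand, the scaled Beta2 prior for the precision $\lambda = 1/\tau^2$ is the following:

$$\pi(\lambda) = \frac{\Gamma(p+q)}{\Gamma(q)\Gamma(p)}\, \beta\, \frac{(\beta\lambda)^{q-1}}{(1 + \beta\lambda)^{p+q}}, \qquad \lambda > 0, \qquad (15)$$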
and for $\beta$ small we have heavy tails for robust inference. Therefore we use a Student-t-Beta2$(4,1,1,1/\beta = 10000)$ prior for the proposed Gibbs sampler. The residuals in the bottom panel of Figure 4 are given by $\epsilon_t = y_t - E(F_t\theta_t \mid y_{1:T})$. We can see the mild outlier in 2000, with $E(\omega_{y,t} \mid y_{1:T}) = 0.81$. On the other hand, the trend of the series (top panel of Figure 4) shows some changes, the most abrupt in 2001 with an estimated $\omega_{\theta,t1} = 0.75$. The slope has several jumps, the most dramatic one in 2003 with $E(\omega_{\theta,t2} \mid y_{1:T}) = 0.75$.

Figure 4: Outliers and structural breaks in the annual Consumer Price Index in Puerto Rico



3.2 Example 2: quarterly gas consumption in the UK 

In this section we consider the series of quarterly gas consumption in the UK from 1960 to 1986, analyzed in Frühwirth-Schnatter (1994), West & Harrison (1997) and Petris et al. (2010), to mention some. In the last reference an interesting detection of outliers is presented, but with an extremely complicated model, which, however, assumes that the scales are modeled through Inverted Gamma priors. We first show that a natural model for detecting outliers that uses Inverted Gammas is unable to detect the obvious change in the series. We then show that a far simpler model, easier to understand and implement, is able to detect the changes when the scaled Beta2 is assumed instead of Inverted Gammas.

A plot of the series on the log scale shows some changes in the seasonal factor in the third quarter of 1970, and a DLM obtained as a quarterly seasonal factor model plus a local linear trend model could fit these data reasonably well. The observation (F) and system matrices of the model are:



F= [ 1 1 
The unknown parameters are the observation variance $V_t$ and three elements of $W_t$:



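$$W_t = \mathrm{diag}\left(\sigma^2_{\theta,t1},\ \sigma^2_{\theta,t2},\ \sigma^2_{\theta,t3},\ 0,\ 0\right),$$

where $\sigma^2_{\theta,t1}$, $\sigma^2_{\theta,t2}$ and $\sigma^2_{\theta,t3}$ are the unknown variances of the level of the series, the slope of the linear trend and the seasonal component, respectively. We implemented the proposed Gibbs sampler and compared it with the following objective Bayesian strategy: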



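$$V_t^{-1} = \lambda_y\,\omega_{y,t}, \qquad \lambda_y \sim \mathrm{Gamma}(10000, 10000), \qquad \omega_{y,t} \sim \mathrm{Gamma}(2, 1/2),$$

$$W_{t,i}^{-1} = \lambda_{\theta,i}\,\omega_{\theta,ti}, \qquad \lambda_{\theta,i} \sim \mathrm{Gamma}(10000, 10000), \qquad \omega_{\theta,ti} \sim \mathrm{Gamma}(2, 1/2).$$

Note that, in summary, this approach uses a Student-t with four degrees of freedom and a non-informative Gamma for modelling the outliers and changes in the states.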
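The link between the Gamma(2, 1/2) latent variables and the Student-t with four degrees of freedom can be checked by simulation. The sketch below assumes that the second Gamma argument is a scale (so that $\omega$ has mean one, as with Gamma$(\nu/2, 2/\nu)$ for $\nu = 4$) and that the error is $N(0, (\lambda\omega)^{-1})$; it compares the empirical distribution of the simulated errors with the known closed-form CDF of the standard Student-t with four degrees of freedom. The function names are hypothetical.

```python
import math
import random

def draw_student_t_via_mixture(nu, lam, rng):
    """Draw from N(0, 1/(lam * omega)) with omega ~ Gamma(nu/2, scale = 2/nu)."""
    omega = rng.gammavariate(nu / 2.0, 2.0 / nu)   # latent scale, mean one
    return rng.gauss(0.0, 1.0 / math.sqrt(lam * omega))

def student_t4_cdf(t):
    """Closed-form CDF of the standard Student-t with four degrees of freedom."""
    ratio = t * t / (1.0 + t * t / 4.0)
    return 0.5 + 0.375 * (t / math.sqrt(1.0 + t * t / 4.0)) * (1.0 - ratio / 12.0)

rng = random.Random(123)
draws = [draw_student_t_via_mixture(4.0, 1.0, rng) for _ in range(20000)]

for point in (-2.0, 0.0, 1.0, 3.0):
    # Empirical CDF of the mixture draws next to the exact Student-t(4) CDF.
    empirical = sum(d <= point for d in draws) / len(draws)
    print(point, round(empirical, 3), round(student_t4_cdf(point), 3))
```

Small values of $\omega$ inflate the variance of a single error term, which is why posterior values of $\omega$ well below one point to an outlier or a structural break.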

Figure 5 displays the posterior means of $\omega_{y,t}$ and $\omega_{\theta,ti}$, $t = 1, \dots, 108$ and $i = 1, 2, 3$, using the Student-t(4) with non-informative Gamma approach. Using a Student-t(4)-Gamma(10000, 10000) prior for modelling the series of quarterly gas consumption in the UK, no outliers or structural breaks are detected in this series.

Figure 6 displays the posterior means of $\omega_{y,t}$ and $\omega_{\theta,ti}$, $t = 1, \dots, 108$ and $i = 1, 2, 3$, using the Student-t-Beta2$(4,1,1,1/\beta = 10000)$. The two approaches give different results. Using the Student-t-Beta2$(4,1,1,1/\beta = 10000)$ we obtain the expected results for modelling the changes in the dynamic linear model. According to the proposed Bayesian approach there are no observational outliers, except for the mild outlier in the third quarter of 1983, with an estimated $\omega_{y,t}$ of 0.83. This approach indicates that the trend and its slope are stable. There are many structural changes in the seasonal component, the most extreme one occurring in the third quarter of 1971 with $E(\omega_{\theta,t3} \mid y_{1:T}) = 0.025$.
Figure 5: UK gas consumption: posterior means of the $\omega_t$'s using the Student-t(4)-Gamma(10000,10000)

Figure 6: UK gas consumption: posterior means of the $\omega_t$'s using the Student-t-Beta2$(4,1,1,1/\beta = 10000)$
Figure 7 shows the estimated 95% credible intervals for the unobservable seasonal and trend components. The credible interval for the seasonal component is wider at the beginning of the seventies because it is a period of high variability. These results are very similar to those found with a more complex model in Petris et al. (2010).
Figure 7: UK gas consumption: trend and seasonal component, with 95% credible intervals, using the Student-t-Beta2$(4,1,1,1/\beta = 10000)$

4 Conclusions

This paper follows up the proposal by Pericchi (2010) and Perez & Pericchi (2009) to use the scaled Beta 2 distribution as a sensible general replacement for Inverted Gammas as priors for scale parameters in hierarchical models.

Here we show that if the squares of the scales of a Cauchy (or, more generally, a Student-t) are assumed to be distributed as a scaled Beta2, a general result for the marginal of the location is obtained, with closed-form results for particular hyper-parameters. Furthermore, our scheme lends itself naturally to a simple Gibbs sampling procedure, which adds no substantial complication to the Inverted Gamma prior analysis but improves its performance. We suggest these priors as a suitable basis for robust objective analysis of Dynamic Linear Models. The original proposal by Pericchi & Perez (2010) of modeling the scale (as opposed to the square of the scale) also leads to a sensible analysis, and for particular values it yields an explicit "Horseshoe" prior, with a pole at zero. The proposal here is very similar in its properties (without a pole at zero but with a sizeable finite peak at the origin), but it is simpler and easier to implement.

A Appendix 

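We have that $\pi(\theta) = \int_0^\infty \pi(\theta \mid \tau^2)\,\pi(\tau^2)\,d\tau^2$; clearly

$$\pi(\theta) = \frac{k_1}{\beta^{p}\,Be(p,q)} \int_0^\infty (\tau^2)^{p-3/2} \left(1 + \frac{(\theta-\mu)^2}{\nu\tau^2}\right)^{-(\nu+1)/2} \left(\frac{\tau^2}{\beta} + 1\right)^{-(p+q)} d\tau^2. \qquad (16)$$

Making the change of variable $z = 1/\left(\nu\tau^2/(\theta-\mu)^2 + 1\right)$, for $\theta \neq \mu$ we obtain

$$\pi(\theta) = \frac{k_1\,\beta^{q}}{Be(p,q)} \left(\frac{\nu}{(\theta-\mu)^2}\right)^{q+1/2} \int_0^1 z^{q-1/2} (1-z)^{\nu/2+p-1} \left(1 - z\left(1 - \frac{\beta\nu}{(\theta-\mu)^2}\right)\right)^{-(p+q)} dz, \qquad (17)$$

therefore

$$\pi(\theta) = k\,\beta^{q} \left(\frac{\nu}{(\theta-\mu)^2}\right)^{q+1/2} {}_2F_1\!\left(p+q,\ q+\tfrac{1}{2},\ \tfrac{\nu+1}{2}+p+q,\ 1-\frac{\beta\nu}{(\theta-\mu)^2}\right), \qquad (18)$$

see 9.111 of Gradshteyn & Ryzhik (1965). For $\theta = \mu$ we have that

$$\pi(\theta) = \frac{k_1}{\beta^{p}\,Be(p,q)} \int_0^\infty (\tau^2)^{p-3/2} \left(\frac{\tau^2}{\beta} + 1\right)^{-(p+q)} d\tau^2. \qquad (19)$$

Making the change of variable $z = 1/(\tau^2/\beta + 1)$, then

$$\pi(\theta) = \frac{k_1}{\beta^{1/2}\,Be(p,q)} \int_0^1 z^{q-1/2} (1-z)^{p-3/2}\, dz, \qquad (20)$$

therefore

$$\pi(\theta) = \frac{k_1\,Be(p-1/2,\ q+1/2)}{\beta^{1/2}\,Be(p,q)}, \qquad (21)$$

see 8.380 of Gradshteyn & Ryzhik (1965).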